Abstract — Passive monitoring utilizing distributed wireless sniffers is an effective technique to monitor activities in wireless infrastructure
networks for fault diagnosis, resource management and critical path analysis. In this paper, we introduce a quality of monitoring 
(QoM) metric defined by the expected number of active users monitored, and investigate the problem of maximizing QoM by judiciously 
assigning sniffers to channels based on the knowledge of user activities in a multi-channel wireless network. Two types of capture 
models are considered. The user-centric model assumes frame-level capturing capability of sniffers such that the activities of different 
users can be distinguished while the sniffer-centric model only utilizes the binary channel information (active or not) at a sniffer. For 
the user-centric model, we show that the implied optimization problem is NP-hard, but a constant approximation ratio can be attained 
via polynomial complexity algorithms. For the sniffer-centric model, we devise stochastic inference schemes to transform the problem 
into the user-centric domain, where we are able to apply our polynomial approximation algorithms. The effectiveness of our proposed 
schemes and algorithms is further evaluated using both synthetic data as well as real-world traces from an operational WLAN. 

Index Terms — Wireless network, mobile computing, approximation algorithm, binary independent component analysis. 

1 Introduction

Deployment and management of wireless infrastructure 
networks (WiFi, WiMax, wireless mesh networks) are 
often hampered by the poor visibility of PHY and 
MAC characteristics, and complex interactions at various 
layers of the protocol stacks both within a managed 
network and across multiple administrative domains. 
In addition, today's wireless usage spans a diverse set 
of QoS requirements, from best-effort data services to 
VoIP and streaming applications. The task of managing 
the wireless infrastructure is made more difficult by 
the additional constraints posed by QoS-sensitive 
services. Monitoring the detailed characteristics of an 
operational wireless network is critical to many system 
administrative tasks, including fault diagnosis, resource 
management, and critical path analysis for infrastructure 
upgrades. 

Passive monitoring is a technique in which a dedicated 
set of hardware devices, called sniffers or monitors, is 
used to monitor activities in wireless networks. These 
devices capture transmissions of wireless devices or 
activities of interference sources in their vicinity and 
store the information in trace files, which can be analyzed 
in a distributed fashion or at a central location. Wireless 
monitoring [1], [2], [3], [4], [5] has been shown to complement 
wire-side monitoring using SNMP and base station 
logs, since it reveals detailed PHY (e.g., signal strength, 
spectrum density) and MAC behaviors (e.g., collisions, 
retransmissions), as well as timing information (e.g., 
backoff time), which are often essential for wireless 
diagnosis. The architecture of a canonical monitoring 
system consists of three components: 1) sniffer hardware, 
2) sniffer coordination and data collection, and 3) data 
processing and mining. 

Depending on the type of networks being monitored 
and hardware capability, sniffers may have access to 
different levels of information. For instance, spectrum 
analyzers can provide detailed time- and frequency-domain 
information. However, due to limited bandwidth 
or lack of hardware/software support, they may not 
be able to decode the captured signal to obtain frame-level 
information on the fly. Commercial off-the-shelf 
network interfaces such as WiFi cards, on the other hand, 
can only provide frame-level information.¹ The volume 
of raw traces in both cases tends to be quite large. For 
example, in the study of the UH campus WLAN, 4 
million MAC frames were collected per sniffer per 
channel over an 80-minute period, resulting in a total of 
8 million distinct frames from four sniffers. Furthermore, 
due to the propagation characteristics of wireless signals, 
a single sniffer can only observe activities within its 
vicinity. Observations of sniffers within close proximity 
over the same frequency band tend to be highly 
correlated. Therefore, two pertinent issues need to be 
addressed in the design of passive monitoring systems: 
1) what to monitor, and 2) how to coordinate the sniffers 
to maximize the amount of captured information. 

1. Certain chipsets and device drivers allow inclusion of header 
fields to store a few physical layer parameters in the MAC frames. 
However, such implementations are generally vendor and driver 
dependent. 

This paper assumes a generic architecture of passive 
monitoring systems for wireless infrastructure networks, 
which operate over a set of contiguous or non-contiguous 
channels or bands.² To address the first question, 
we consider two categories of capture models 
that differ in their information capturing capability. The 
first category, called the user-centric model, assumes the 
availability of frame-level information such that activities of 
different users can be distinguished. The second category 
is the sniffer-centric model, which only assumes binary 
information regarding channel activities, i.e., whether 
some user is active in a specific channel near a sniffer. 
Clearly, the latter imposes minimum hardware requirements 
and incurs minimum cost for transferring and 
storing traces. In some cases, due to hardware constraints 
(e.g., in wide-band cognitive radio networks) or 
security/privacy considerations, decoding of frames to 
extract user-level information is infeasible, and thus only 
binary sniffer information might be available for surveillance 
purposes. We further characterize theoretically the 
relationship between the two models. 

Ideally, a network administrator would want to perform 
network monitoring on all channels simultaneously. 
However, multi-radio sniffers are known to be 
large and expensive to deploy [6]. We therefore assume 
that sniffers in our system are low-cost devices that can 
only observe a single wireless channel at a time. 
To maximize the amount of captured information, we 
introduce a quality-of-monitoring (QoM) metric defined 
as the total expected number of active users detected, 
where a user is said to be active at time t if it transmits 
over one of the wireless channels. The basic problem 
underlying all of our models can be cast as finding an 
assignment of sniffers to channels that maximizes the 
QoM. QoM is an important metric that quantifies the 
efficiency of monitoring solutions in systems where it is 
important to capture as comprehensive information as 
possible (e.g., intrusion/anomaly detection [7], [8] and 
diagnosis systems [9], [10]). 

We note that the problem of sniffer assignment, in an 
attempt to maximize the QoM metric, is further complicated 
by the dynamics of real-life systems: 1) the 
user population changes over time (churn), 2) the activities 
of a single user are dynamic, and 3) connectivity between 
users and sniffers may vary due to changes in channel 
conditions or mobility. These practical considerations reveal 
the fundamental intertwining of "learning", where 
the usage pattern of wireless resources is to be estimated 
online based on captured information, and "decision 
making", where sniffer assignments are made based on 
available knowledge of the usage pattern. In fact, in 
our earlier work [11], we prove that during learning, 
each instance of the decision making is equivalent to 
solving an instance of the sniffer assignment problem 
with the parameters properly chosen. Thus, effective and 
efficient algorithms for the sniffer assignment problem are 
critical. In this paper, we focus on designing algorithms 
that aim at maximizing the QoM metric with different 
granularities of a priori knowledge. The usage patterns 
are assumed to be stationary during the decision period. 

2. A channel can be a single frequency band, a code in CDMA 
systems, or a hopping sequence in frequency hopping systems. 

Our Contribution: In this paper, we make the following 
contributions toward the design of passive monitoring 
systems for multi-channel wireless infrastructure networks: 

• We provide a formal model for evaluating the quality 
of monitoring. 

• We study two categories of monitoring models that 
differ in the information capturing capability of passive 
monitoring systems. For each of these models, 
we provide algorithms and methods that optimize 
the quality of monitoring. 

• We unravel interactions between the two monitoring 
models by devising two methods to convert the 
sniffer-centric model to the user-centric domain by 
exploiting the stochastic properties of underlying 
user processes. 

More specifically, we show that in both the user- and 
sniffer-centric models considered, a pure strategy in which 
each sniffer is assigned to a single channel suffices to 
maximize the QoM. In the user-centric model, 
we show that our problem can be formulated as a 
covering problem. The problem is proven to be NP-hard, 
and constant-factor polynomial-time approximation algorithms 
are provided. With the sniffer-centric model, we show that 
although the only information retrieved by the sniffers 
is binary (in terms of channel activity), the "structure" 
of the underlying processes is retained and can be 
recovered. Two different approaches are proposed that 
utilize the notion of Independent Component Analysis 
(ICA) [12] and allow mapping the sniffer assignment 
problem to the user-centric model. The first approach, 
Quantized Linear ICA (QLICA), estimates the hidden 
structure by applying a quantization process to the outcome 
of traditional ICA, while the second approach, 
Binary ICA (bICA) [13], decomposes the observation 
data into OR mixtures of hidden components and recovers 
the underlying structure. Finally, an extensive 
evaluation study is carried out using both synthetic data 
and real-world traces from an operational WLAN. 

The paper is organized as follows. An overview of 
related work is provided in Section 2. In Section 3, we 
formally introduce the QoM metric and the user-centric 
and sniffer-centric models for a passive monitoring system. 
The NP-hardness and polynomial-time algorithms 
for the maximum effort coverage problem that underlies 
two variants of the user-centric model are discussed 
in Section 4. The relationship between the user-centric 
and sniffer-centric models is established in Section 5, 
where we also describe two schemes for solving the QoM 
problem under the sniffer-centric model. We present the 
results of the evaluation study using both synthetic and 
real traces in Section 6. We discuss issues regarding 
practical system implementation in Section 7 and finally 
conclude the paper in Section 8. 

2 Related Work 

In this section, we provide an overview of related work 
pertaining to wireless network monitoring, and binary 
independent component analysis. 

Wireless monitoring: There has been much work on 
wireless monitoring from a system-level approach, in 
an attempt to design complete systems and address 
the interactions among the components of such systems. 
The work in [14], [15] uses AP and SNMP logs and wired-side 
traces to analyze WiFi traffic characteristics. Passive 
monitoring using multiple sniffers was first introduced 
by Yeo et al. in [1], [2], where the authors articulate the 
advantages and challenges posed by passive measurement 
techniques and discuss a system for performing 
wireless monitoring with the help of multiple sniffers, 
which is based on synchronization and merging of the 
traces via broadcast beacon messages. The results obtained 
for these systems are mostly experimental. Rodrig 
et al. in [3] used sniffers to capture wireless data and 
analyze the performance characteristics of an 802.11 WiFi 
network. One key contribution was the introduction of a 
finite state machine to infer missing frames. The Jigsaw 
system proposed in [4] focuses on large-scale 
monitoring using over 150 sniffers. 

A number of recent works focus on the diagnosis of 
wireless networks to determine causes of errors. In [16], Chandra 
et al. proposed WiFiProfiler, a diagnostic tool that 
utilizes exchange of information among wireless hosts 
about their network settings and the health of network 
connectivity. Such shared information allows inference 
of the root causes of connectivity problems. Building 
on their monitoring infrastructure, Jigsaw, Cheng et al. 
[17] developed a set of techniques for automatic characterization 
of outages and service degradation. They 
showed how sources of delay at multiple layers (physical 
through transport) can be reconstructed by using a combination 
of measurements, inference and modeling. Qiu 
et al. in [18] proposed a simulation-based approach to 
determine sources of faults in wireless mesh networks 
caused by packet dropping, link congestion, external 
noise, and MAC misbehavior. 

All the aforementioned work focuses on building 
monitoring infrastructure and developing diagnosis 
techniques for wireless networks. The question of optimally 
allocating monitoring resources to maximize captured 
information remains largely untouched. In [19], 
Shin and Bagchi consider the selection of monitoring 
nodes and their associated channels for monitoring wireless 
mesh networks. The optimal monitoring is formulated 
as a maximum coverage problem with group budget 
constraints (denoted MC-GBC), which was previously 
studied by Chekuri and Kumar in [20]. The user-centric 
model results in a problem formulation that is similar 
to (albeit different from) the one addressed in [19]. 
On the one hand, we assume all sniffers may be used for 
monitoring (hence our problem departs from 
the classical maximum-coverage problem), while on 
the other hand we focus on the weighted version of the 
problem, where elements to be covered have weights. 
One should note that none of the lower bounds mentioned 
in [20], [19] apply to our problem. 

Binary independent component analysis: Binary ICA 
is a special variant of traditional ICA. Whereas traditional 
ICA assumes linear mixing of continuous signals, binary ICA 
considers boolean mixing (e.g., OR, XOR) of binary signals. 
Existing solutions to binary ICA mainly 
differ in their assumptions on the prior distribution of the 
mixing matrix, the noise model, and/or the hidden causes. In 
[21], Yeredor considers binary ICA in XOR mixtures 
and investigates the identifiability problem. A deflation 
algorithm is proposed for source separation based on entropy 
minimization. In [21], the number of independent 
random sources K is assumed to be known. Furthermore, 
the mixing matrix is a K-by-K invertible matrix. 
In [22], an infinite number of hidden causes following 
the same Bernoulli distribution is assumed. Reversible 
jump Markov chain Monte Carlo and Gibbs sampler 
techniques are applied. In contrast, in our model, the 
hidden causes may follow different distributions. Streich 
et al. [23] study the problem of multi-assignment clustering 
for boolean data, where an object is represented by 
a boolean attribute vector. The key assumption made in 
this work is that elements of the observation matrix are 
conditionally independent given the model parameters. 
This greatly reduces the computational complexity and 
makes the scheme amenable to a gradient descent optimization 
solution; however, the assumption is in general 
invalid. In [24], the problem of factorization and denoising 
of binary data due to independent continuous 
sources is considered. The sources are assumed to 
follow a beta distribution and are not binary. Finally, [22] 
considers the under-represented case of fewer sensors than 
sources with continuous noise, while [24], [23] deal with 
the over-determined case, where the number of sensors 
is much larger than the number of sources. 

3 Problem formulation 

3.1 Notation and network model 

Consider a system of m sniffers and n users, where 
each user u operates in one of K channels, $c(u) \in \mathcal{K} = \{1, \ldots, K\}$. 
The users can be wireless (mesh) routers, 
access points or mobile users. At any point in time, a 
sniffer can only monitor packet transmissions over a single 
channel. We assume the propagation characteristics 
of all channels are similar. We represent the relationship 
between users and sniffers using an undirected bipartite 
graph $\mathcal{G} = (S, U, E)$, where S is the set of sniffer nodes 
and U is the set of users. Note that $\mathcal{G}$ represents a 
general relationship between the users and sniffers, and 
no propagation or coverage model is assumed. An edge 
$e = (s, u) \in E$ exists between sniffer $s \in S$ and user 
$u \in U$ if s can capture the transmission from u, or 
equivalently, u is within the monitoring range of s. If 
transmissions from a user cannot be captured by any 
sniffer, the user is excluded from $\mathcal{G}$. For every vertex 
$v \in U \cup S$, we let N(v) denote the neighbors of v in 
$\mathcal{G}$. For users, their neighbors are sniffers, and vice versa. 
We will also refer to G as the binary $m \times n$ adjacency 
matrix of graph $\mathcal{G}$. 

We consider assignments of sniffers to channels, 
$a : S \to \mathcal{K}$. Given a sniffer assignment a, we 
consider the partition of the set of sniffers $S = \bigcup_{k=1}^{K} S_k$, 
where $S_k$ is the set of sniffers assigned to channel k. We 
further consider the corresponding partition of the set of 
users $U = \bigcup_{k=1}^{K} U_k$, where $U_k$ is the set of users operating 
in channel k. Let $\mathcal{G}_k = (S_k, U_k, E_k)$ denote the bipartite 
subgraph of $\mathcal{G}$ induced by channel k. Given any sniffer 
s, we let $N_k(s) = N(s) \cap U_k$, i.e., the set of neighboring 
users of s that use channel k. 

A monitoring strategy determines the channel(s) a sniffer 
monitors. It could be a pure strategy, i.e., the channel 
a sniffer is assigned to is fixed, or a mixed strategy, 
where sniffers choose their assigned channel in each 
slot according to a certain distribution. Formally, let 
$\mathcal{A} = \{a \mid a : S \to \mathcal{K}\}$ be the set of all possible assignments. 
Let $\pi : \mathcal{A} \to [0, 1]$ be a probability distribution 
over the set of sniffer assignments. We refer to such a 
distribution as a mixed strategy. A pure strategy that 
selects a single channel per sniffer is a special case of 
a mixed strategy, namely, one with $\pi(a) = 1$ for a single 
assignment a. It follows that the best pure strategy can do 
no better than the best mixed strategy. However, as shown 
in the next section, the optimal solution can be obtained 
using just a pure strategy. 

In this paper we consider the problem of finding the 
monitoring strategy that maximizes QoM, defined as 
the expected number of users detected given the sniffer 
assignments. The main notations used in this paper are 
summarized in Table 1. 

3.2 Models for Observing User Access Patterns 

In this section, two categories of parametric models are 
proposed to describe the observability of usage patterns. 
We assume time is separated into slots, where each slot 
represents a fixed duration of time. A user is active if 
there exists a transmission event from the user during 
the slot time. In the experiments, the slot time is chosen to 
be on the same order as the maximum packet transmission 
time. Furthermore, we assume that all channel and user 
statistics remain stationary for the monitoring period of 
T time slots. 

User-centric model: First, we consider transmission 
events in the network from the user's viewpoint. We 
assume that G is known, e.g., obtained by inspecting the packet header 
information in each sniffer's captured traces. 



TABLE 1: Notations 

Symbol                       Description
m, n, K, T                   number of sniffers, users, channels, and observations
$\mathcal{G}$                bipartite graph representing user and sniffer adjacency
S                            set of sniffer nodes in $\mathcal{G}$
U                            set of users in $\mathcal{G}$
$\mathcal{A}$                set of all possible sniffer-channel assignments
$\mathbf{x}_{m \times 1}$    vector of m binary random variables from m sniffers
$\mathbf{y}_{n \times 1}$    vector of n binary random variables from n users
$X_{m \times T}$             collection of T observations of $\mathbf{x}$
$Y_{n \times T}$             collection of T observations of $\mathbf{y}$
$G_{m \times n}$             binary adjacency matrix of $\mathcal{G}$
$\mathbf{p}_{1 \times n}$    active probability vector of n users
$c(u)$                       active channel of user u
$\mathcal{A}(u)$             sniffer assignments that can monitor user u
$\pi(a)$                     probability of assignment a under the mixed strategy

Fig. 1: A toy example. Users are shown as white circles and 
sniffers as black circles. The sniffer range indicates 
whether or not a sniffer can capture a user's transmissions. 

In the user-centric model, the transmission probabilities 
of the users $\mathbf{p} = \{p_u \mid u \in U\}$ are known, and user 
activities are assumed to be independent.³ Here, $p_u$ denotes 
the transmission probability of user u. Both $p_u$ and G can be 
estimated by putting all sniffers in the same channel and 
iterating through all possible channels for a sufficiently 
long time. Each user process may be IID or non-IID over time. 

Consider a wireless network with 2 sniffers and 2 
users on 2 channels (Figure 1). Users $u_1$ and $u_2$ are 
active on channels 1 and 2, respectively. The transmission 
probabilities of the users are $p_1 = 0.2$ and $p_2 = 0.5$. The 
user-centric model assumes that G and $\mathbf{p} = \{p_1, p_2\}$ are available. 
Note that the maximum value of QoM in the above 
network is 0.7, attained when $s_1$ and $s_2$ are assigned to 
channels 1 and 2, respectively. 
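
The toy example can be checked by enumerating all four pure assignments. The following is a minimal sketch, not part of the original paper: because the figure is not reproduced in the text, it assumes that both sniffers are within range of both users, and the names `neighbors`, `channel` and `qom` are illustrative.

```python
from itertools import product

# Toy instance: 2 sniffers, 2 users, 2 channels.
p = {"u1": 0.2, "u2": 0.5}        # transmission probabilities p_u
channel = {"u1": 1, "u2": 2}      # c(u): channel each user operates on
# ASSUMPTION: both sniffers can hear both users (adjacency is not given in the text).
neighbors = {"u1": {"s1", "s2"}, "u2": {"s1", "s2"}}
sniffers = ["s1", "s2"]

def qom(assignment):
    """Expected number of active users monitored under a pure assignment."""
    return sum(
        p[u]
        for u in p
        if any(assignment[s] == channel[u] for s in neighbors[u])
    )

# Evaluate every assignment of the two sniffers to channels {1, 2}.
values = {}
for chans in product([1, 2], repeat=len(sniffers)):
    values[chans] = qom(dict(zip(sniffers, chans)))

best = max(values, key=values.get)
print(values)
print("best assignment (s1, s2):", best, "QoM:", values[best])
```

Under this assumed adjacency, placing the two sniffers on different channels covers both users, which matches the value of 0.7 stated above (up to floating-point rounding).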

Sniffer-centric model: The user-centric model requires 
detailed knowledge of each user's activities. This necessitates 
frame-level capturing capability by the passive 
monitoring system. In the sniffer-centric model, only 
binary information (on or off) of the channel activity at 
each sniffer is observed. 

We denote by $\mathbf{x}_k$ the binary vector of observations 
when all sniffers operate on channel k, and by $X_k$ 
the collection of T realizations of $\mathbf{x}_k$. We assume that 
sniffers' observations on different channels are independent. 
However, dependency exists among observations 
of sniffers operating in the same channel (as a result of 
transmissions made by the same set of users). Given an 
assignment a, a complete characterization of the sniffers' 
observations is given by the joint probability distributions 
$\mathcal{P}_a(\mathbf{x}_k)$, $k = 1, \ldots, K$. Here, $\mathcal{P}_a(\mathbf{x}_k)$ is implicitly 
dependent on the assignment a, in that if sniffer i is not 
assigned to the k-th channel, its binary observation $\mathbf{x}_k(i)$ 
is always zero. By independence of different channels, we 
have $\mathcal{P}_a(\mathbf{x}) = \prod_{k=1}^{K} \mathcal{P}_a(\mathbf{x}_k)$. 

3. The assumption that user activities are independent has been 
widely adopted in the literature; examples are [25] and [26]. In the 
simulation evaluation in Section 6, the proposed algorithms are shown to 
perform well even when such independence is violated. 

Consider again the network in Figure 1. Over T time 
slots, we have two observation matrices $X_1$ and $X_2$ 
of the same dimension ($2 \times T$), corresponding to the 
activities on the two channels. The first and second rows of 
each matrix contain observations from sniffers $s_1$ and 
$s_2$, respectively. The sniffer-centric model assumes only the 
availability of $X_1$ and $X_2$, while G and $\mathbf{p}$ are unknown. 
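
To make the shape of the observation matrices concrete, the sketch below simulates them for the toy network. It is illustrative only and rests on stated assumptions: a sniffer reports 1 in a slot exactly when at least one neighboring user on the monitored channel transmits (consistent with the binary channel-activity description), user activity is IID Bernoulli over slots, and both sniffers hear both users. The function name `simulate_observations` is hypothetical.

```python
import random

def simulate_observations(G, p, user_channel, num_channels, T, seed=0):
    """Return {k: X_k}, where X_k is an m x T list of 0/1 rows.

    X_k[i][t] = 1 iff some user adjacent to sniffer i and operating on
    channel k transmits in slot t (all sniffers tuned to channel k).
    """
    rng = random.Random(seed)
    m, n = len(G), len(p)
    X = {k: [[0] * T for _ in range(m)] for k in range(1, num_channels + 1)}
    for t in range(T):
        # Independent Bernoulli activity per user (IID over time: an assumption).
        y = [1 if rng.random() < p[j] else 0 for j in range(n)]
        for k in X:
            for i in range(m):
                X[k][i][t] = int(any(
                    G[i][j] and y[j] and user_channel[j] == k for j in range(n)
                ))
    return X

G = [[1, 1], [1, 1]]      # ASSUMED adjacency: both sniffers hear both users
p = [0.2, 0.5]            # transmission probabilities of u1, u2
user_channel = [1, 2]     # u1 on channel 1, u2 on channel 2
T = 1000

X = simulate_observations(G, p, user_channel, num_channels=2, T=T)
X1, X2 = X[1], X[2]
print(len(X1), len(X1[0]))                 # dimensions of X_1
print(sum(X1[0]) / T, sum(X2[0]) / T)      # empirical activity rates at s1
```

In this assumed setting the two rows of each matrix coincide, which illustrates the within-channel dependency among sniffer observations noted above.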

Clearly, the sniffer-centric model is not as expressive 
as the user-centric model (formally characterized in Section 5.1). 
However, it has the advantage of being based 
on aggregated statistics, which are likely to remain stationary 
in the presence of moderate user-level dynamics, 
such as users joining and leaving the network, or changes 
in transmission activities (e.g., busy or thinking time). 
Furthermore, obtaining such binary information is less 
costly in both hardware requirements and communication/storage 
complexity. 

4 QoM under the User-Centric Model 

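Under the user-centric model, the goal is to maximize the 
expected number of active users monitored. Recall that 
$p_u$ is the transmission probability of user u. This problem 
can be formulated as: 

$$
\begin{aligned}
\max_{\pi} \quad & \sum_{u \in U} p_u \sum_{a \in \mathcal{A}(u)} \pi(a) \\
\text{s.t.} \quad & \pi(a) \in [0, 1],
\end{aligned}
\tag{1}
$$

where $\mathcal{A}(u)$ is the set of assignments that monitor 
user u, i.e., $\mathcal{A}(u) = \{a \mid \exists s \in N(u) \text{ s.t. } a(s) = c(u)\}$. The 
objective function gives the expected number of active users 
monitored (the QoM) given the probability of each assignment. 
It can be written as 

$$
\sum_{a \in \mathcal{A}} \pi(a) \sum_{u \in U} p_u \cdot \mathbb{1}_{\{a \in \mathcal{A}(u)\}}, \tag{2}
$$

where $\mathbb{1}_{\{\cdot\}}$ is the indicator function. Since Eq. (2) is a 
weighted average, with weights $\pi(a)$, of the values attained by 
individual assignments, it is clear 
that a pure strategy can be adopted and is optimal, i.e., 
an optimal assignment is given by 

$$
a^* = \arg\max_{a \in \mathcal{A}} \sum_{u \in U} p_u \cdot \mathbb{1}_{\{a \in \mathcal{A}(u)\}}. \tag{3}
$$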
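
The claim that a pure strategy suffices can be checked numerically on a small instance. The sketch below is an illustration under assumptions, not the authors' code: the instance (3 sniffers, 4 users, 2 channels) is hypothetical, the search is exhaustive and only practical for tiny m, and the helper names `pure_value` and `best_pure_assignment` are made up for this example.

```python
from itertools import product
import random

def pure_value(a, p, user_channel, G):
    """Sum of p_u over users u with some neighboring sniffer tuned to c(u)."""
    m, n = len(G), len(p)
    return sum(
        p[u] for u in range(n)
        if any(G[s][u] and a[s] == user_channel[u] for s in range(m))
    )

def best_pure_assignment(p, user_channel, G, K):
    """Exhaustive search over all K**m pure assignments; tiny m only."""
    m = len(G)
    return max(product(range(1, K + 1), repeat=m),
               key=lambda a: pure_value(a, p, user_channel, G))

# Hypothetical instance: rows of G are sniffers, columns are users.
G = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
p = [0.3, 0.6, 0.4, 0.1]
user_channel = [1, 2, 1, 2]
K = 2

a_star = best_pure_assignment(p, user_channel, G, K)
best = pure_value(a_star, p, user_channel, G)

# A random mixed strategy pi over all assignments: its value is the
# pi-weighted average of the pure values, hence never above `best`.
rng = random.Random(1)
assignments = list(product(range(1, K + 1), repeat=len(G)))
weights = [rng.random() for _ in assignments]
total = sum(weights)
mixed = sum(w / total * pure_value(a, p, user_channel, G)
            for w, a in zip(weights, assignments))
print(a_star, best, mixed)
```

Because the mixed-strategy objective is a convex combination of pure-assignment values, `mixed` can never exceed `best`, whatever weights are drawn.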



4.1 MAX-EFFORT-COVERAGE problem 

Under the user-centric model, the objective is to find the 
sniffer-channel assignment that monitors the largest 
(weighted) set of users, subject to the constraint that 
each sniffer can only monitor one of the K channels 
at a time. We henceforth refer to this problem as the 
MAX-EFFORT-COVERAGE (MEC) problem. Note that in MEC 
the weights can in fact be any non-negative values and 
are not limited to [0, 1]. The MEC problem can be cast as 
the following integer program (IP): 

$$
\begin{aligned}
\max \quad & \sum_{u \in U} p_u y_u \\
\text{s.t.} \quad & \sum_{k=1}^{K} z_{s,k} \le 1 && \forall s \in S \\
& y_u \le \sum_{s \in N(u)} z_{s,c(u)} && \forall u \in U \\
& y_u \le 1 && \forall u \in U \\
& y_u, z_{s,k} \in \{0, 1\} && \forall u, s, k.
\end{aligned}
\tag{4}
$$

Each sniffer is associated with a set of binary decision 
variables: $z_{s,k} = 1$ if sniffer s is assigned to channel k, and 
0 otherwise. $y_u$ is a binary variable indicating whether or 
not user u is monitored, and $p_u$ is the weight associated 
with user u. The objective function is the total weight of 
the users that can be monitored with assignment z. 

One should first note that the problem is trivial if 
K = 1, since all sniffers would simply be assigned to 
the sole available channel. We can therefore assume that 
$K \ge 2$. The MEC problem can be viewed as a special 
case of MC-GBC (mentioned in Section 2) in which 
all sniffers are used. One should note that previous 
hardness results for MC-GBC (both NP-hardness and 
hardness of approximation) were based on a reduction 
from the standard maximum coverage problem. It follows 
that none of these proofs are applicable to the MEC 
problem. Surprisingly, there has not been any work done 
explicitly on the MEC problem, which seems to be a 
natural and important variant of the maximum coverage 
problem. 

4.2 Hardness of MEC 

In what follows, we show that the MEC problem is NP-hard 
for $K \ge 2$, even in the unweighted case (i.e., where 
$p_u = 1$ for all $u \in U$). The hardness of the MEC problem 
follows from the choices available to the different 
sniffers. It is inherently different from the hardness 
of the MC-GBC problem, which follows from 
limiting the number of sniffers one is allowed to use. 
We prove hardness of MEC using a reduction from 
the problem of Monotone-3SAT (MON3SAT), which is 
known to be NP-hard (see [27], [28]). In MON3SAT, we 
are given as input an instance of 3SAT where every 
clause consists of either solely positive variables or 
solely negated variables. The goal is to decide whether or 
not there exists an assignment that satisfies all clauses. 

In [29], we proved that the unweighted MEC 
problem is NP-hard, even for K = 2. The result implies 
that one would have to settle for approximate solutions 
to MEC. We first note that Guruswami and Khot show 
in [30] that MON3SAT is NP-hard to approximate within 
a factor of $7/8 + \epsilon$ for every $\epsilon > 0$. The following is a 
corollary of the above fact: 

Corollary 1: The MEC problem is NP-hard to approximate 
to within a factor of $7/8 + \epsilon$ for every $\epsilon > 0$. 

4.3 Algorithms for MEC 

Since MEC is a special case of the MC-GBC problem, we 
can use the available approximation algorithms for MC-GBC 
(e.g., [20], [31]) to solve our problem in the user-centric 
model. In what follows, we give a brief overview 
of the algorithms we use. 

The Greedy algorithm: The Greedy algorithm iteratively 
assigns sniffers to channels, where at each step it 
chooses the sniffer and the channel that (locally) maximize 
the total weight of newly covered users, i.e., users not 
yet monitored. 

It is proven in [20] that in the unweighted case, i.e., 
where all users have the same weight, Greedy is guaranteed 
to produce a $\frac{1}{2}$-approximate solution, and that this is 
tight. The following theorem shows that the same holds 
for the weighted case, which generalizes the MEC 
problem. 

Theorem 2: Greedy is a $\frac{1}{2}$-approximation algorithm for 
the weighted MC-GBC problem. 
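
The greedy rule described above can be sketched in a few lines. This is an illustrative reading of the description rather than the authors' implementation: ties are broken by lowest sniffer index and then lowest channel (an assumption), and the instance is hypothetical.

```python
def greedy_assignment(p, user_channel, G, K):
    """Greedy for MEC: repeatedly fix the (sniffer, channel) pair that adds
    the largest weight of not-yet-covered users.

    Returns (assignment, covered_weight); assignment maps sniffer -> channel.
    """
    m, n = len(G), len(p)
    assignment = {}
    covered = set()
    while len(assignment) < m:
        best_gain, best_pair = -1.0, None
        for s in range(m):
            if s in assignment:
                continue
            for k in range(1, K + 1):
                gain = sum(p[u] for u in range(n)
                           if G[s][u] and user_channel[u] == k
                           and u not in covered)
                if gain > best_gain:      # strict: first maximizer wins ties
                    best_gain, best_pair = gain, (s, k)
        s, k = best_pair
        assignment[s] = k
        covered |= {u for u in range(n) if G[s][u] and user_channel[u] == k}
    return assignment, sum(p[u] for u in covered)

# Hypothetical instance: rows of G are sniffers, columns are users.
G = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
p = [0.3, 0.6, 0.4, 0.1]
user_channel = [1, 2, 1, 2]

assignment, weight = greedy_assignment(p, user_channel, G, K=2)
print(assignment, weight)
```

On this instance Greedy first tunes sniffer 0 to channel 2 and ends with a covered weight of about 1.1, whereas assigning the sniffers to channels (1, 2, 1) covers weight 1.3. Greedy is therefore suboptimal here but well within the factor of 1/2.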

LP-based algorithm: This algorithm is based on solving 
the LP relaxation of the IP formulation for MEC appearing 
in (4). Once we have an optimal solution to the LP 
relaxation, we round the fractional solution into an integral 
solution using, e.g., the probabilistic rounding technique 
of Srinivasan [31]. We next sketch the basic idea of this 
probabilistic rounding technique. Let $z^*$ be an optimal 
solution to the LP relaxation of (4), and let s be any 
sniffer. If $\sum_k z^*_{s,k} > 0$, one can view the induced solution 
$z^*_s : \mathcal{K} \to [0, 1]$ as a probability measure over the different 
channels (via normalization). The goal is to decide on an 
integral channel assignment for s, namely, to set each 
$z^*_{s,k}$ to a value in {0, 1} such that exactly one variable out 
of the K variables corresponding to sniffer s is set to the 
value 1. The algorithm builds a binary tree whose leaves 
correspond to the K variables $z_{s,k}$ associated with sniffer 
s, and pairs unset variables in a bottom-up fashion. 
The pairing is made such that an internal node sets at 
least one of the variables corresponding to its children. 
This is done while adjusting the (probability) value of 
the (other) unset variable. This approach is proven to 
produce a valid assignment in linear time [31]. We refer 
to the above algorithm as ProbRand. 
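
The tree-based procedure of [31] is not reproduced here. As a simpler stand-in that conveys the "fractional row as a probability measure" idea, the sketch below samples one channel per sniffer from its normalized fractional row. This is a substitute technique, not ProbRand, and no approximation guarantee is claimed for it; the fractional values are made up for illustration rather than produced by an LP solver, and `round_assignment` is a hypothetical name.

```python
import random

def round_assignment(z_frac, seed=0):
    """Sample one channel per sniffer from its normalized fractional row.

    z_frac[s][k-1] is the fractional value z*_{s,k}; rows sum to at most 1.
    Sniffers whose row is all zeros default to channel 1 (arbitrary choice).
    """
    rng = random.Random(seed)
    assignment = {}
    for s, row in enumerate(z_frac):
        if sum(row) <= 0:
            assignment[s] = 1
            continue
        channels = list(range(1, len(row) + 1))
        # random.choices normalizes the weights internally.
        assignment[s] = rng.choices(channels, weights=row, k=1)[0]
    return assignment

# Hypothetical fractional solution for 3 sniffers and 2 channels.
z_frac = [[0.5, 0.5],
          [0.0, 1.0],
          [0.7, 0.3]]
a = round_assignment(z_frac)
print(a)
```

Every sniffer receives exactly one channel, so the output is always a feasible pure assignment for (4); a variable whose fractional value is already 1 (sniffer 1 above) is never changed by the sampling.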

Theorem 3: ProbRand is a $(1 - 1/e)$-approximation algorithm 
for the weighted MC-GBC problem. 

We note that the approximation guarantee of the LP-based 
algorithm is the best possible for the MC-GBC problem. 
However, this lower bound does not necessarily 
hold for the MEC problem. 



5 QoM under the Sniffer-Centric Model 

The user-centric model is more expressive than the 
sniffer-centric model, which assumes the availability of 
the binary observation matrix X only. However, we will 
show in this section the two models are intrinsically 
connected by devising algorithms to infer G and p from 
X. 

Recall that in the sniffer-centric model, given an assignment 
$a \in \mathcal{A}$, $\prod_{k=1}^{K} \mathcal{P}_a(\mathbf{x}_k)$ is the probability distribution 
of binary observations from the m sniffers. Let 
$w(\mathbf{x}_k)$ be the number of active users captured by sniffers 
in channel k given sniffer observations $\mathbf{x}_k$. The MEC 
problem under the sniffer-centric model is defined as 
follows: 

$$
\begin{aligned}
\max_{\pi} \quad & \sum_{a \in \mathcal{A}} \pi(a) \sum_{k=1}^{K} \mathbb{E}[w(\mathbf{x}_k)] \\
\text{s.t.} \quad & \pi(a) \in [0, 1].
\end{aligned}
\tag{5}
$$

The expectation is with respect to $\mathcal{P}_a(\mathbf{x}_k)$. The QoM in the 
sniffer-centric model can be interpreted as the expected 
number of active users captured on all channels given 
the assignment probabilities. Clearly, a pure strategy 
suffices, i.e., there exists an optimal assignment such 
that 

$$
a^* = \arg\max_{a \in \mathcal{A}} \sum_{k=1}^{K} \mathbb{E}[w(\mathbf{x}_k)]. \tag{6}
$$

Even with pure strategies, the optimization problem 
defined in (5) is still challenging to solve directly. The 
main difficulty arises from the evaluation of $\mathbb{E}[w(\mathbf{x}_k)]$. 
Given $\mathbf{x}_k$, one cannot decide how many users are active. 
Consider two scenarios. In the first, two users are 
observed by two sniffers, respectively. In the second, 
a single user is observed by both sniffers. From binary 
observations alone, one cannot distinguish the two 
cases, which correspond to different numbers of active 
users. Furthermore, in contrast to the user-centric model, 
where transmission activities of different users are 
independent, observations of sniffers are correlated. As 
a result, $\mathcal{P}_a(\mathbf{x}_k)$ cannot be simplified to a product form. 
This motivates us to exploit the underlying (though not 
directly observable) independence among users, and map 
the optimization problem in the sniffer-centric model to QoM 
under the user-centric model. 

In the sniffer-centric model, each sniffer only reports 
binary output regarding the activities in the channel 
currently monitored by that sniffer, and thus the access 
probabilities of the users as well as the bipartite graph $\mathcal{G}$ 
are both hidden. Recall that G refers to the binary 
adjacency matrix of $\mathcal{G}$. We first derive the necessary and 
sufficient conditions for recovering the transmission 
probabilities of the users given G and $\mathcal{P}(x)$. 

Let $y = \{y_1, y_2, \ldots, y_n\}^T$ be a vector of n binary 
random variables, where $y_j = 1$ if user j transmits in 
its associated channel, and $y_j = 0$ otherwise. $y_k$ is the 
vector of activities for users transmitting on channel k 




Fig. 2: A sample network scenario with number of sniffers 
m = 5, number of users n = 10, its bipartite graph 
transformation and its matrix representation. White circles represent 
independent users, black circles represent sniffers and dashed 
lines illustrate sniffers' coverage range. 



(i.e., users in $U_k$). The joint distribution of y is given by 
$\mathcal{P}(y) = \prod_{j: y_j = 1} p_j \prod_{j: y_j = 0} (1 - p_j)$. The product form is due 
to the independence among users' activities. The main 
question we aim to answer is: given the vector $x_k$ of 
sniffers' observations, what knowledge can be obtained 
regarding $y_k$? Throughout this section, unless otherwise 
specified, we limit the discussion to users and sniffers in 
a fixed channel k, and drop the subscript. We will also 
denote by $g_{ij}$ the entry in the i'th row and j'th column 
of G. 

Using the adjacency matrix, and using $\wedge$ to represent 
Boolean AND and $\vee$ to represent Boolean OR, we have 
the following: 



$$ x_i = \bigvee_{j=1}^{n} (g_{ij} \wedge y_j), \quad i = 1, \ldots, m, \qquad (7) $$



i.e., $x_i = 1$ iff there exists a user j within the range of 
sniffer i ($g_{ij} = 1$) that transmits ($y_j = 1$). Define the 
set $Y(x) = \{ y \mid \bigvee_{j=1}^{n} (g_{ij} \wedge y_j) = x_i, \forall i \}$, i.e., the set of 
user activity profiles that are consistent with the sniffers' 
observations. Therefore, 



$$ \mathcal{P}(x) = \mathcal{P}(y \in Y(x)) = \sum_{y \in Y(x)} \mathcal{P}(y). \qquad (8) $$



An example network with sniffers and users, the 
corresponding bipartite graph, and its matrix representation 
G are given in Figure 2. 
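
The OR-mixture relations (7) and (8) can be illustrated with a short sketch. This is a minimal example, not the authors' code: the function names `observe` and `observation_distribution`, and the small matrices used below, are hypothetical and chosen only for illustration. It enumerates all user activity profiles y, which is only practical for a handful of users.

```python
from itertools import product

def observe(G, y):
    """Eq. (7): x_i = OR over j of (g_ij AND y_j), for each sniffer row of G."""
    return tuple(int(any(g and a for g, a in zip(row, y))) for row in G)

def observation_distribution(G, p):
    """Eq. (8): P(x) is the total probability of all profiles y consistent with x."""
    dist = {}
    for y in product((0, 1), repeat=len(p)):
        prob = 1.0
        for p_j, y_j in zip(p, y):
            prob *= p_j if y_j else 1.0 - p_j  # users are independent
        x = observe(G, y)
        dist[x] = dist.get(x, 0.0) + prob
    return dist

# Hypothetical example: sniffer 1 hears user 1 only, sniffer 2 hears both users.
G = [[1, 0],
     [1, 1]]
p = [0.2, 0.5]
dist = observation_distribution(G, p)

# The ambiguity discussed earlier: two users seen by one sniffer each, and one
# user seen by both sniffers, produce the same binary observation.
two_users = observe([[1, 0], [0, 1]], (1, 1))
one_user = observe([[1], [1]], (1,))
```

In this example the observation x = (1, 0) never occurs, because any transmission heard by sniffer 1 is also heard by sniffer 2.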

5.1 Relationship between the user-centric and 
sniffer-centric models with known G and unknown p 

The necessary and sufficient conditions that uniquely 
determine p using G and $\mathcal{P}(x)$ are characterized in the 
following theorem. 

Theorem 4: Given $\mathcal{G} = (S, U, E)$, p can be uniquely 
determined by $\mathcal{P}(x)$ iff $\forall u_j \neq u_{j'} \in U$, $N(u_j) \neq N(u_{j'})$. 



Proof: 

It is easy to see that the necessary condition holds. 
If two users have the same set of sniffer neighbors, 
unless packet headers are analyzed, their activities 
cannot be distinguished. 

To prove the sufficient condition, we construct a 
procedure to determine p from the sniffers' joint 
distribution $\mathcal{P}(x)$. 

Case 1: First, we consider a more restrictive 
constraint, namely, $\forall u_j \neq u_{j'} \in U$, $N(u_j) \not\subseteq N(u_{j'})$. We 
let $g_j$ denote the j'th column of the adjacency matrix 
G, i.e., a binary vector of length m. In other words, 
$g_j$ indicates which sniffers cover the j'th user. Since 
$\forall u_j \neq u_{j'} \in U$, $N(u_j) \not\subseteq N(u_{j'})$, we have $g_j \cup g_{j'} \neq g_j$ 
and $g_j \cup g_{j'} \neq g_{j'}$. From (8), we have 

$$ \mathcal{P}(x = g_j) = p_j \prod_{j' \in U,\, j' \neq j} (1 - p_{j'}). $$

Since $\mathcal{P}(x = 0) = \prod_{j \in U} (1 - p_j)$ (recall by our abuse 
of notation that 0 is the all-zero vector of length m), 
we have 

$$ p_j = \frac{\mathcal{P}(x = g_j)}{\mathcal{P}(x = g_j) + \mathcal{P}(x = 0)}. \qquad (9) $$



Case 2: Now we consider the case when the 
condition $\forall u_j \neq u_{j'} \in U$, $N(u_j) \not\subseteq N(u_{j'})$ is violated. 
Without loss of generality, assume only one such 
pair exists and $N(u_{j'}) \subset N(u_j)$ (the analysis for 
more complicated cases follows the same line of 
arguments). In this case, $p_{j'}$ can be derived as in 
the previous case. However, equation (9) does not 
hold for user j any more, since if $x = g_j$, user j' may 
or may not be active. More specifically, 

$$ \mathcal{P}(x = g_j) = p_j \prod_{j'' \in U,\, j'' \neq j, j'} (1 - p_{j''}). \qquad (10) $$

Therefore, we have 

$$ p_j = \frac{\mathcal{P}(x = g_j)(1 - p_{j'})}{\mathcal{P}(x = g_j)(1 - p_{j'}) + \mathcal{P}(x = 0)}. \qquad (11) $$



In other words, the active probability of users can be 
computed by considering the users with the smallest 
degree in G first, and then applying (11) iteratively 
in ascending order of node degree. 

□ 
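
As a sanity check on Case 1 of the proof, the following sketch recovers p from the joint distribution of x using the ratio in (9). It is an illustration under stated assumptions, not the authors' implementation: the helper names and the 3-sniffer, 3-user matrix are hypothetical, the distribution is computed exactly by enumeration rather than estimated from traces, and only the restrictive Case 1 condition (no user's sniffer set is contained in another's) is handled.

```python
from itertools import product

def observation_distribution(G, p):
    """Exact P(x) under the OR model, by enumerating activity profiles y."""
    dist = {}
    for y in product((0, 1), repeat=len(p)):
        prob = 1.0
        for p_j, y_j in zip(p, y):
            prob *= p_j if y_j else 1.0 - p_j
        x = tuple(int(any(g and a for g, a in zip(row, y))) for row in G)
        dist[x] = dist.get(x, 0.0) + prob
    return dist

def recover_p_case1(G, dist):
    """Eq. (9): p_j = P(x = g_j) / (P(x = g_j) + P(x = 0)), g_j = j'th column of G."""
    m, n = len(G), len(G[0])
    p_zero = dist.get((0,) * m, 0.0)
    p_hat = []
    for j in range(n):
        g_j = tuple(G[i][j] for i in range(m))
        p_gj = dist.get(g_j, 0.0)
        p_hat.append(p_gj / (p_gj + p_zero))
    return p_hat

# Hypothetical coverage: user columns are (1,1,0), (0,1,1), (1,0,1);
# no column is contained in another, so Case 1 applies.
G = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 1]]
p_true = [0.1, 0.3, 0.05]
p_hat = recover_p_case1(G, observation_distribution(G, p_true))
```

With an exact distribution the recovered values match p_true up to floating-point error; with empirical frequencies from a finite trace they would only be estimates.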

The above theorem essentially shows that in the 
sniffer-centric model, if G is known, then one can 
effectively determine the transmission probabilities of the 
users. In the presence of measurement noise, methods such 
as Expectation-Maximization can be applied. We 
therefore obtain an instance of the problem corresponding to 
our user-centric model, which can be solved efficiently 
using the algorithms described in Section 4.3. 



Comment: Though Theorem 4 requires that the users be 
connected to different sets of sniffers, violation of the 
condition would not affect the channel assignment. For 
example, when two users u and v are connected to the 
same set of sniffers, and are thus "indistinguishable" 
in the binary sniffer observations, we can effectively 
view them as a single user with active probability 
$1 - (1 - p_u)(1 - p_v)$ if users are active independently, 
or $p_u + p_v$ if only one user can be active at a time (e.g., 
due to CSMA). 
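
The two merging rules in the comment can be written as a small helper. The function name `merged_active_probability` and the numeric values are hypothetical and serve only to make the arithmetic concrete.

```python
def merged_active_probability(p_u, p_v, mutually_exclusive=False):
    """Active probability of the single 'virtual' user that replaces u and v.

    mutually_exclusive=True models the case where at most one of the two
    users can be active at a time (e.g., due to CSMA).
    """
    if mutually_exclusive:
        return p_u + p_v
    return 1.0 - (1.0 - p_u) * (1.0 - p_v)

independent = merged_active_probability(0.2, 0.5)
exclusive = merged_active_probability(0.2, 0.5, mutually_exclusive=True)
```

For these inputs the independent rule gives a value close to 0.6 and the exclusive rule a value close to 0.7.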

5.2 Inference of unknown G and p using binary ICA 

In this section, we derive methods to estimate the 
unknown mixing matrix G and the active probability vector 
p. Consider again the example in Figure 1. Let sniffers $s_1$ 
and $s_2$ be assigned to channel 1 and observe the activity 
of a single user $y_1$. In this case, $x_1 = x_2$. Therefore, 

$$ \mathcal{P}(x) = \mathcal{P}(x_1) I_{\{x_1 = x_2\}} = \mathcal{P}(y_1 = x_1) I_{\{x_1 = x_2\}}, \quad \text{where} \quad \mathcal{P}(y_1 = x_1) = \begin{cases} p_1, & x_1 = 1 \\ 1 - p_1, & x_1 = 0. \end{cases} $$

Therefore, if the joint distribution of x is the product 
of a marginal distribution with an indicator function, 
and the two marginal distributions are identical, we can 
infer that both sniffers observe the same set of users. 
Generally, the joint distribution of x preserves a certain 
stochastic "structure" of the users' activities. We will 
formalize this observation in the subsequent sections by 
devising two inference methods to estimate G and p 
from $\mathcal{P}(x)$. 

5.2.1 Quantized Linear ICA (QLICA) 
First we estimate G by applying the classic ICA 
on the binary data followed by a quantization process. 
$\mathcal{P}(y)$ can then be calculated by solving a quadratic 
programming problem. 

Estimation of G: The problem is similar to what 
was addressed by the Independent Component Analysis 
(ICA) scheme Ull, where the observed data is expressed 
as a linear transformation of latent variables that are 
non-Gaussian and mutually independent. Classic ICA 
assumes that both y and x are continuous random 
variables and that x is the outcome of a linear mixing of 
y, and thus is not directly applicable to our problem. 
We adopt the algorithm presented in [32] with some 
modifications. The basic idea is as follows. 

We first observe that (7) can be simplified using linear 
mixing and a (coordinate-wise) unit step function, 

$$ x = U(Gy), \qquad (12) $$

where $U(\cdot)$ is a unit step function defined by $U(r) = 
I_{\{r > 0\}}$. By applying the standard ICA on x, we can 
"decompose" the observation as $x \approx Ls$, where L is the 
linear mixing matrix and s is the collection of random 
sources. However, neither L nor s is a solution 
to our problem since they contain fractional values. 
Therefore, we quantize L to get the inferred binary mixing 
matrix $\hat{G}$ as follows, 

$$ \hat{G} = U(L \Lambda^{-1} - T). \qquad (13) $$

$\Lambda$ is the diagonal scaling matrix with $\lambda_{ii} = \mathrm{maxstep}(l_i)$, 
where $l_i$ is the i'th column of L, and 

$$ \mathrm{maxstep}(r) = \begin{cases} \max(r) & \text{if } |\max(r)| > |\min(r)| \\ \min(r) & \text{otherwise.} \end{cases} \qquad (14) $$

$\Lambda$ scales the elements in the mixing matrix to the 
maximum value 1. The matrix T contains thresholds, such 
that the higher the threshold value, the sparser $\hat{G}$ is. 
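
The quantization step can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the matrix L below is made up rather than produced by an actual ICA run, and the code assumes no column of L is entirely zero (otherwise the scaling would divide by zero).

```python
def maxstep(r):
    """Entry of r with the largest magnitude; a tie is resolved towards min(r)."""
    hi, lo = max(r), min(r)
    return hi if abs(hi) > abs(lo) else lo

def quantize_mixing_matrix(L, T):
    """G_hat = U(L * Lambda^{-1} - T), where U(r) = 1 if r > 0 and 0 otherwise.

    Lambda is diagonal, so L * Lambda^{-1} divides column j of L by
    maxstep(column j); the dominant entry of each column becomes 1.
    """
    m, n = len(L), len(L[0])
    scale = [maxstep([L[i][j] for i in range(m)]) for j in range(n)]
    return [[1 if L[i][j] / scale[j] - T[i][j] > 0 else 0 for j in range(n)]
            for i in range(m)]

# Hypothetical fractional mixing matrix and a uniform threshold of 0.5.
L = [[0.9, -0.1],
     [0.8, -0.7]]
T = [[0.5, 0.5],
     [0.5, 0.5]]
G_hat = quantize_mixing_matrix(L, T)
```

Because each column is divided by its own dominant entry, the sign ambiguity of the ICA output is removed before thresholding: a column of mostly negative values is flipped so that its dominant entry maps to 1.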

Estimation of $\mathcal{P}(y)$: Once $\hat{G}$ is determined, $\mathcal{P}(y)$ needs 
to be estimated. From $x_i = U(\hat{g}_i y)$, where $\hat{g}_i$ is the i'th 
row of $\hat{G}$ (i.e., the estimated coverage vector of sniffer 
$s_i$), we have, 

$$ p(x_i = 0) = \prod_{j: \hat{g}_{ij} = 1} p(y_j = 0). \qquad (15) $$

The product is due to the independence of the $y_j$'s. Taking 
$\log(\cdot)$ on both sides, we have 

$$ \log(p(x_i = 0)) = \sum_{j: \hat{g}_{ij} = 1} \log(p(y_j = 0)). \qquad (16) $$

Let $\alpha_i = \log(p(x_i = 0))$, and $\beta_j = \log(p(y_j = 0))$. Define 
$\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}^T$, and $\beta = \{\beta_1, \beta_2, \ldots, \beta_n\}^T$. We 
can calculate $p(y_j = 0)$ (and consequently obtain $\mathcal{P}(y)$) 
by solving the following optimization problem. 

$$ \min \; \|\alpha - \hat{G}\beta\|^2 \quad \text{s.t.} \quad \beta \leq 0, \qquad (17) $$

where $\|\cdot\|$ is the $\ell_2$ norm of a vector. The objective 
function minimizes the distance between the observed 
vector $\alpha$ and its reconstructed counterpart $\hat{G}\beta$. 
Clearly, this is a constrained quadratic programming 
problem with a positive semi-definite matrix (i.e., all 
eigenvalues are non-negative), and can be solved in 
polynomial time. 
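
Problem (17) is a small non-positive least-squares problem. The sketch below solves it with projected gradient descent, which is a swapped-in generic technique and not necessarily the solver used by the authors. The function name, the step size, the iteration count, and the example inputs are all assumptions; in particular the fixed step of 0.1 is only safe when it is smaller than the reciprocal of the largest eigenvalue of $2\hat{G}^T\hat{G}$, which holds for the 2 x 2 example used here.

```python
import math

def solve_log_probabilities(G_hat, alpha, step=0.1, iters=5000):
    """Projected gradient descent for min ||alpha - G_hat beta||^2 s.t. beta <= 0."""
    m, n = len(G_hat), len(G_hat[0])
    beta = [0.0] * n
    for _ in range(iters):
        # Residual G_hat * beta - alpha, then gradient 2 * G_hat^T * residual.
        resid = [sum(G_hat[i][j] * beta[j] for j in range(n)) - alpha[i]
                 for i in range(m)]
        grad = [2.0 * sum(G_hat[i][j] * resid[i] for i in range(m))
                for j in range(n)]
        # Gradient step followed by projection onto the feasible set beta <= 0.
        beta = [min(0.0, b - step * g) for b, g in zip(beta, grad)]
    return beta

# Hypothetical example: sniffer 1 covers user 1, sniffer 2 covers users 1 and 2.
G_hat = [[1, 0],
         [1, 1]]
p_true = [0.2, 0.5]
alpha = [math.log(1 - 0.2), math.log((1 - 0.2) * (1 - 0.5))]  # log p(x_i = 0)
beta = solve_log_probabilities(G_hat, alpha)
p_hat = [1.0 - math.exp(b) for b in beta]  # p(y_j = 1) = 1 - exp(beta_j)
```

Here $\alpha$ is computed exactly from p_true, so the recovered p_hat is essentially p_true; with $\alpha$ estimated from a finite trace, the least-squares fit would absorb the sampling noise.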

Channel selection: With the estimated $\hat{p}$ and $\hat{G}$ at hand, 
we effectively transform the sniffer-centric model to the 
user-centric model. The methods described in Section 4.3 can 
then be applied to determine the channel assignment of 
each sniffer. The QuantizeICA algorithm, which infers $\hat{p}$ and 
$\hat{G}$, is presented in Algorithm 1 and the complete QLICA 
scheme is illustrated in Figure 3(a). 

Algorithm 1: Quantized linear ICA inference 

QuantizeICA (X) 

input : Data matrix $X_{m \times T}$ 

init : T = threshold matrix; 

1 L = mixing matrix obtained by applying ICA on X; 

2 $\Lambda$ = diagonal scaling matrix calculated from L; 

3 $\hat{G} = U(L \Lambda^{-1} - T)$ as in (13); 

4 Calculate $\alpha$ from X with $\alpha_i = \log(p(x_i = 0))$; 

5 Obtain $\hat{p}$ by solving the quadratic programming problem in (17); 

6 output: $\hat{p}$ and $\hat{G}$ 



[Figure 3 panels: (a) Quantized Linear ICA (blocks: Linear ICA, Quantization, Quadratic Programming, MEC); (b) Binary ICA (blocks: Binary ICA, MEC)] 

Fig. 3: Channel selection algorithm under QLICA and BICA models 



A toy example: We next give a simple example that 
provides insight into the operation of QLICA. Let 
us reconsider the network in Figure 1 with $u_1$ and $u_2$ 
operating on a single channel. With T = 10 observations, 
suppose we have the activity matrix Y, where 
$y_{ij} = 1$ indicates that user $y_i$ is active on the channel 
at time slot j. Y is hidden and unknown to us. Since 

$$ G = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, $$

we have the observation matrix $X = U(GY)$. 
Applying the linear ICA, we obtain 

$$ L = \begin{bmatrix} 0.30 & -0.35 \\ -0.20 & -0.35 \end{bmatrix}, \quad \Lambda^{-1} = \begin{bmatrix} -2.89 & 0 \\ 0 & -2.89 \end{bmatrix}. $$

With threshold T = 0.5, solving equation (13) and 
the optimization problem (17), we have the following 
inferred results. 

$$ \hat{G} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad \hat{p} = \{0.5, 0.2\}. $$

The inferred results $\hat{G}$ and $\hat{p}$ are actually permutations of 
the original mixing matrix G and the active probability 
p. We see that QLICA can successfully infer information 
regarding the underlying model from $\mathcal{P}(x)$. 



5.2.2 Binary ICA (BICA) 

Instead of applying a quantization process on the result 
of linear ICA, we can apply the BICA algorithm 
proposed in [13] to determine $\mathcal{P}(y)$ and G by exploiting the 
OR mixture model between y and the observation 
variable x. Compared with QLICA, BICA explicitly accounts 
for the generative model and thus leads to more accurate 
estimation results. However, this comes at the expense of 
higher computational complexity. In the worst case, given 
m sniffers, the run time of the algorithm is $O(m 2^m)$ (see footnote 4). 
For completeness, we first define some notation and then 
outline the BICA algorithm. 

4. Several techniques are suggested in [13] to reduce the computational 
complexity. 



Joint estimation of G and $\mathcal{P}(y)$: The basic idea of 
the BICA algorithm is as follows: given an observation matrix 
X from m sniffers, we first assume that there exist 
at most $2^m$ distinguishable users. Each user is 
represented in G by a unique column $\in \{0, 1\}^m$ indicating 
its connections to the m sniffers. The $2^m$ users are ordered by 
ascending values of their corresponding columns. We 
recursively construct 2 submatrices from X such 
that the first submatrix captures the joint distribution 
of activities from the first $2^{m-1}$ users (in product form 
due to the independence assumption). If the submatrix 
is small enough, the joint distribution can be inferred 
directly; otherwise, a divide-and-conquer approach is 
taken. Once the joint distribution of the first $2^{m-1}$ users 
is available, that of the remaining $2^{m-1}$ users can be 
inferred from the second submatrix. 

Next, we present in detail the proposed BICA 
algorithm. Define $X^0_{(h-1) \times T}$ to be a submatrix of 
X, where the rows correspond to observations of 
$x_1, x_2, \ldots, x_{h-1}$ for $t = 1, 2, \ldots, T$ such that $x_{ht} = 0$, 
i.e., the first submatrix mentioned above. Also, define 
$X_{(h-1) \times T}$ to be the matrix consisting of the first h - 1 
rows of X, i.e., the second submatrix. Letting $F(\cdot)$ be the 
frequency function of some event, we have the iterative 
inference algorithm illustrated in Algorithm 2. 

When the number of observation variables m = 1, 
there are only two possible unique sources, one that can 
be detected by the monitor $x_1$, denoted by [1], and one 
that cannot, denoted by [0]. Their active probabilities 
can easily be calculated by counting the frequency of 
($x_1 = 1$) and ($x_1 = 0$) (lines 1-3). If $m \geq 2$, $\hat{p}$ and $\hat{G}$ are 
estimated through a recursive process. $X^0_{(m-1) \times T}$ is 
sampled from columns of X that have $x_m = 0$. If $X^0_{(m-1) \times T}$ 
is an empty set (which means $x_{mt} = 1, \forall t$) then we can 
associate $x_m$ with a constantly active component and set 
the other components' probability accordingly (lines 4-7). 
If $X^0_{(m-1) \times T}$ is non-empty, we invoke FindBICA on 
two sub-matrices $X^0_{(m-1) \times T}$ and $X_{(m-1) \times T}$ to determine 
$p_{1 \ldots 2^{m-1}}$ and $p'_{1 \ldots 2^{m-1}}$, then infer $p_{2^{m-1}+1 \ldots 2^m}$ (lines 8-12). 
Finally, $p_h$ and its corresponding column in G are 
pruned in the final result if $p_h < \varepsilon$ (lines 13-15). 
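
The recursion described above can be sketched as follows. This is a reading of the textual description, not the authors' reference implementation: the names `find_bica` and `prune`, the 0-based indexing, the clamping of estimates to [0, 1], and the guards against division by zero are all assumptions added for robustness. The probability of the component seen only by the last sniffer is obtained here from $F(x_m = 0)$ and the product form in (15), which is one way to close the recursion. The observation matrix is a hypothetical 2 x 10 example.

```python
def frequency(row, value):
    """Empirical frequency of `value` in a binary row."""
    return sum(1 for v in row if v == value) / len(row)

def find_bica(X):
    """Return p of length 2^m. Entry h belongs to the column of G whose bit i
    (least significant first) tells whether sniffer i+1 covers that component."""
    m, T = len(X), len(X[0])
    if m == 1:
        return [frequency(X[0], 0), frequency(X[0], 1)]
    half = 2 ** (m - 1)
    quiet = [t for t in range(T) if X[m - 1][t] == 0]  # slots with x_m = 0
    X_full = X[:m - 1]
    if not quiet:
        # x_m is always 1: attribute it to one constantly active component.
        return find_bica(X_full) + [1.0] + [0.0] * (half - 1)
    X_zero = [[row[t] for t in quiet] for row in X_full]
    p_low = find_bica(X_zero)     # components not covered by sniffer m
    p_merged = find_bica(X_full)  # each component merged with its sniffer-m twin
    p_high = [0.0] * half
    for l in range(1, half):
        if p_low[l] < 1.0:
            value = 1.0 - (1.0 - p_merged[l]) / (1.0 - p_low[l])
            p_high[l] = min(1.0, max(0.0, value))
    rest = 1.0
    for l in range(1, half):
        rest *= 1.0 - p_high[l]
    if rest > 0.0:
        # F(x_m = 0) is the product of (1 - p) over all components covered by m.
        p_high[0] = min(1.0, max(0.0, 1.0 - (len(quiet) / T) / rest))
    return p_low + p_high

def prune(p, m, eps=0.01):
    """Drop the unobservable all-zero column and components with p_h < eps."""
    kept = []
    for h, p_h in enumerate(p):
        column = [(h >> i) & 1 for i in range(m)]
        if any(column) and p_h >= eps:
            kept.append((column, p_h))
    return kept

# Hypothetical observations from two sniffers over T = 10 slots.
X = [[0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
     [0, 0, 1, 1, 0, 1, 1, 1, 1, 0]]
p = find_bica(X)
components = prune(p, m=2)
```

For this input the raw estimate is approximately [1, 0, 0.5, 0.2], and pruning leaves two components: one covered only by the second sniffer with probability about 0.5, and one covered by both sniffers with probability about 0.2.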



Channel selection: Now that $\hat{p}$ and $\hat{G}$ are inferred, we 
again transform the sniffer-centric model to the 
user-centric model and apply the methods in Section 4.3 to find the 
optimal sniffer-channel assignment. The complete 
BICA scheme is illustrated in Figure 3(b). 

Algorithm 2: Incremental binary ICA inference 

FindBICA (X) 

input : Data matrix $X_{m \times T}$ 

init : $n = 2^m - 1$; 

p = 1 x n zero vector; 

G = m x ($2^m - 1$) matrix with columns corresponding to all possible 
binary vectors of length m; 

$\varepsilon$ = the minimum threshold for $p_h$ to be considered a real 
component; 

1 if m = 1 then 
2     $p_1 = F(x_1 = 0)$; 
3     $p_2 = F(x_1 = 1)$; 
  else 
4     if $X^0_{(m-1) \times T} = \emptyset$ then 
5         $p_{1 \ldots 2^{m-1}}$ = FindBICA ($X_{(m-1) \times T}$); 
6         $p_{2^{m-1}+1} = 1$; 
7         $p_{2^{m-1}+2 \ldots 2^m} = 0$; 
      else 
8         $p_{1 \ldots 2^{m-1}}$ = FindBICA ($X^0_{(m-1) \times T}$); 
9         $p'_{1 \ldots 2^{m-1}}$ = FindBICA ($X_{(m-1) \times T}$); 
10        for $l = 2, \ldots, 2^{m-1}$ do 
11            $p_{l+2^{m-1}} = 1 - \frac{1 - p'_l}{1 - p_l}$; 
12        $p_{2^{m-1}+1} = 1 - \frac{F(x_m = 0)}{\prod_{l=2^{m-1}+2}^{2^m} (1 - p_l)}$; 
13 for $h = 1, \ldots, 2^m$ do 
14     if ($p_h < \varepsilon$) $\vee$ ($g_h = 0$) then 
15         prune $p_h$ and corresponding column $g_h$; 

16 output: p and G 

A toy example: Again, let us reconsider the network 
in Figure 1. Recall that p = {0.2, 0.5}, T = 10 and the 
observation matrix is 

X = 

0010000010 
110 11110 

Following the FindBICA algorithm, we first 
decompose X into $X^0_{1 \times T}$ and $X_{1 \times T}$. 

$X^0_{1 \times T}$ = [0 0], 

$X_{1 \times T}$ = [0 0 1 0 0 0 0 0 1 0]. 

From $X^0_{1 \times T}$, we have $p_1 = 1$ and $p_2 = 0$. Similarly, 
from $X_{1 \times T}$, we have $p'_1 = 0.8$ and $p'_2 = 0.2$. Applying the 
procedure in Algorithm 2, we have the inferred results: 

$$ \hat{G} = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \quad \hat{p} = \{1, 0, 0.5, 0.2\}. $$

The first and second columns of $\hat{G}$ and $\hat{p}$ will be pruned 
since we are not interested in components that cannot 
be observed (with $g_i = 0$) or components with 
active probabilities smaller than $\varepsilon$ ($\varepsilon$ is set to 0.01 
in this case). This yields a final result exactly 
equal to the ground truth. The toy example shows that both 
QLICA and BICA can accurately predict the underlying 
model without prior knowledge of the latent variables' 
activities in simple cases. The next section provides 
an extensive evaluation of the performance of QLICA and 
BICA. 

6 Simulation Validation 

In this section we evaluate the performance of different 
algorithms under the user-centric and sniffer-centric 
models using both synthetic and real traces. Synthetic 
traces allow us to control the parameter settings while 
real network traces provide insights on the performance 
under realistic traffic loads and user distributions. 

In addition to the Greedy and LP-based algorithms, 
we also consider Max Sniffer Channel (Max), where a 
sniffer is assigned to its busiest channel. This scheme is 
the most intuitive approach in practical networks where 
the user model is not available and sniffers have to 
decide their channel assignment non-cooperatively based on 
local observations. Note that it is easy to construct scenarios 
where Max performs arbitrarily badly. Thus, its worst-case 
performance is unbounded. For the inference scheme in 
the QLICA model, we used the FastICA algorithm [12] 
to compute the linear mixing matrix L. 

6.1 QoM under different models 

6.1.1 Synthetic traces 

Fig. 4: Hexagonal layout with users ('+'), sniffers (solid dots), 
base stations (triangles), and channels of each cell (in different 
triangle colors) 



In this set of simulations, 500 wireless users are placed 
randomly in a 500 x 500 square meter area. The area 
is partitioned into hexagonal cells with a circumcircle of 
radius 86 meters. Each cell is associated with a base 
station operating in a channel (and so are the users 
in the cell). The channel to base station assignment 
ensures that no neighboring cells use the same channel. 
25 sniffers are deployed in a grid formation separated 
by a distance of 100 meters, with a coverage radius of 120 
meters. A snapshot of the synthetic deployment is 
shown in Figure 4. The transmission probability of users 
is selected uniformly in (0, 0.06], resulting in an average 
busy probability of 0.2685 in each cell. Threshold T for 
QLICA is set at 0.5 and threshold $\varepsilon$ for BICA is set at 
0.01. We vary the total number of orthogonal channels 
from 3 to 9 (see footnote 5). The results shown are the average of 20 
runs with different seeds. 

[Figure 5 panels: (a) 3 channels (b) 6 channels (c) 9 channels] 

Fig. 5: QoM under three models: the user-centric model (User), QLICA and BICA with 3, 6 and 9-channel synthetic traces 

Figure 5 shows QoM calculated by three algorithms 
(Max, Greedy and LP-Round) and the theoretical upper 
bound (LP-Up) on two models using synthetic traces 
of 3, 6, and 9 channels, respectively. Results of the 
user-centric model are shown in solid lines while results of 
the different inference algorithms (QLICA and BICA) 
in the sniffer-centric model are shown in dotted and 
dashed lines, respectively. In the user-centric model, one 
can see that the performance of Greedy and the LP-based 
algorithm with random rounding are comparable 
to LP-Up, and both outperform Max in all three traces. 
Recall that according to Max, a sniffer non-cooperatively 
decides its own channel assignment and selects the most 
active channel. Clearly, Max does not take into account 
the correlations among the observations of neighboring 
sniffers in the same channel. In contrast, in the 
sniffer-centric case, the proposed inference algorithms can 
indeed extract such a correlative structure from the binary 
observations, as shown by their superior performance 
over Max. 

Additionally, we observe that the expected number of 
users monitored by the algorithms using BICA is higher 
than that of QLICA and is very close to that attained 
in the user-centric model (where we assume 
complete knowledge of users' activities and their 
relationship to sniffers). This indicates that the BICA algorithm 
indeed produces inferred models that are very close to 
the ground truth. Having a good estimate of $\hat{G}$ (see footnote 6) and $\hat{p}$ 
as the input, Greedy and LP-Round can produce channel 
assignments whose performance is close to LP-Up. 

5. In 802.11a networks, there are 8 orthogonal channels in 5.18- 
5.4GHz, and one in 5.75GHz. 

6. A predicted user in G is actually the aggregation of real users 
in a unique sniffer coverage area since we simply cannot distinguish 
between different users that can only be monitored by the same set of 
sniffers. 



We further note that by comparing results from 
Figure 5(a) to Figure 5(c), the QoM metric decreases as 
the total number of channels increases for all schemes, 
including LP-Up. This is due to the fact that users scatter 
over more channels, and a fixed number of sniffers is no 
longer sufficient to provide good coverage. 

6.1.2 Real traces 

In this section, we evaluate our proposed schemes 
using real traces collected from the UH campus wireless 
network using 21 WiFi sniffers deployed in the Philip 
G. Hall. Over a period of 6 hours, between 12 p.m. 
and 6 p.m., each sniffer captured approximately 300,000 
MAC frames. Altogether, 655 unique users are observed 
operating over three channels (see footnote 7). The numbers of users 
observed on WiFi channels 1, 6, and 11 are 382, 118, and 155, 
respectively. The histogram of user active probability 
(calculated as the percentage of 20 μs slots in which a user 
is active) is shown in Figure 7. Clearly, most users are 
active less than 1% of the time except for a few heavy 
hitters. The average user active probability is 0.0014. 

Figure 6 gives the average number of active users 
monitored under the user-centric model, and under the 
models inferred by QLICA and BICA. The number 
of sniffers in the experiments varies from 5 to 21 by 
including only traces from the corresponding sniffers. 
The number of channels is fixed at 3. Except for the 
case with 21 sniffers, all data points are averages of 5 
scenarios with different sets of sniffers, chosen uniformly 
at random. Recall that the average active probability is 
0.0014. Thus, for the best channel assignment scenario, 
the QoM on all channels is around 1. In the 
user-centric case (Figure 6(a)), both Greedy and LP-Round 
significantly outperform Max (by around 50%). 
Moreover, their performance is comparable with LP-Up. As 
the number of sniffers increases, the average number of 
users monitored increases but tends to flatten out since 
most users have been monitored. 

In the sniffer-centric case, similar trends can be 
observed when G and $\mathcal{P}(y)$ are inferred using QLICA 
and BICA (Figure 6(b)(c)). BICA outperforms QLICA in 
general. However, there exists some performance gap 
in both cases due to the loss of information, when 
compared with the user-centric model. The real WiFi 
traces, in contrast to the synthetic scenarios, contain a 
large number of observations and many "mice" users 
(users with very low active probability). Most of the 
time, these users will be removed (since $p_i < \varepsilon$), causing 
higher prediction errors in $\hat{G}$ and $\hat{p}$. 

7. Our measurements used the campus IEEE 802.11g WLAN, which 
has three orthogonal channels. 

Fig. 6: QoM under the user-centric and sniffer-centric models with real WiFi traces. In the user-centric model, the results of 
LP-Round coincide with those of LP-Up. In some cases, the confidence interval is quite small and is thus not observable in 
the figures. 

Fig. 7: Histogram of user active probability measured as the 
percentage of active 20 μs slots. The average active probability 
is 0.0014. 

6.2 Comparison of QLICA and BICA under the 
sniffer-centric model 

To understand the performance difference between QLICA and 
BICA, in this section, we provide a detailed comparison 
of the inferred G and $\mathcal{P}(y)$ in both schemes. 

Performance metrics: We denote by $\hat{G}$ and $\hat{p}$ the 
inferred adjacency matrix and the inferred active 
probability of users, respectively. Two metrics are introduced 
to measure the accuracy of the inferred quantities. 



Structure Error Ratio. This metric indicates how 
accurately the adjacency matrix is estimated. It is 
defined by the Hamming distance between G and 
$\hat{G}$ divided by the size of the matrix. 

$$ H(G, \hat{G}) \triangleq \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} |g_{ij} - \hat{g}_{ij}| \qquad (18) $$

Due to the possible difference in the number and the 
order of inferred independent components in G and 
$\hat{G}$, we need to perform a structure matching process 
before estimating $H(G, \hat{G})$. Details of the algorithm 
can be found in [13]. 

Transmission Probability Error. The prediction 
error in the inferred transmission probability of 
independent users is measured by the Kullback-Leibler 
divergence between the two probability distributions p 
and $\hat{p}$. Let $p'$ and $\hat{p}'$ denote the "normalized" p and 
$\hat{p}$ ($p'_i = p_i / \sum_{i=1}^{n} p_i$). The Transmission Probability Error 
is defined as below: 

$$ D(p' \,\|\, \hat{p}') = \sum_{i=1}^{n} p'_i \log\left(\frac{p'_i}{\hat{p}'_i}\right) \qquad (19) $$

Intuitively, the Transmission Probability Error gets 
larger as the predicted probability distribution $\hat{p}$ 
deviates more from the real distribution p. 
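
Both metrics are straightforward to compute once the inferred components have been matched to the true ones. The sketch below assumes that matching has already been done, so that G and the inferred matrix have the same shape and column order, and that every inferred probability is strictly positive. The function names and the example values are hypothetical.

```python
import math

def structure_error_ratio(G, G_hat):
    """Hamming distance between two equally sized binary matrices, divided by
    the number of entries."""
    m, n = len(G), len(G[0])
    mismatches = sum(G[i][j] != G_hat[i][j] for i in range(m) for j in range(n))
    return mismatches / (m * n)

def transmission_probability_error(p, p_hat):
    """Kullback-Leibler divergence between the normalized p and p_hat."""
    total, total_hat = sum(p), sum(p_hat)
    error = 0.0
    for p_i, q_i in zip(p, p_hat):
        if p_i > 0:  # terms with p_i = 0 contribute nothing
            error += (p_i / total) * math.log((p_i / total) / (q_i / total_hat))
    return error

G = [[1, 0],
     [1, 1]]
G_hat = [[1, 0],
         [0, 1]]
ser = structure_error_ratio(G, G_hat)                          # one wrong entry of four
kl_same = transmission_probability_error([0.2, 0.5], [0.2, 0.5])
kl_diff = transmission_probability_error([0.2, 0.5], [0.25, 0.45])
```

A perfect estimate gives zero for both metrics; the divergence is not symmetric, so the order of the true and inferred vectors matters.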

Results: In this set of experiments, 10 sniffers and n 
users are deployed on a 1,000 x 1,000 square meter area, 
with n varying from 5 to 20. Sniffers are placed randomly 
on the area with the coverage radius set to 100 meters. In 
each run, different sets of user locations are arbitrarily 
chosen. Only placements satisfying the restriction that 
no two users are observed by the same set of sniffers are 
included in the simulation. User transmission probability 
is selected randomly in (0, 0.06]. All users and sniffers 
operate on the same wireless channel since we are only 
interested in the accuracy of the inferred G and $\mathcal{P}(y)$. 
The size of the sample data is T = 10,000. Results are the 
average of 20 different runs. 

From Figure 8, we see that BICA achieves lower 
prediction errors than QLICA on both G and $\mathcal{P}(y)$. The 
former is not very sensitive to the number of users, while 
the performance of QLICA degrades as the number of 
users increases. This is somewhat expected as QLICA is 
fundamentally a linear ICA method. Additionally, the 
estimation of $\mathcal{P}(y)$ in QLICA only utilizes first-order 
statistics. In contrast, BICA is a joint procedure designed 
specifically for binary data following disjunctive 
generation models. 

7 Discussion 

In this section, we discuss several practical considera- 
tions in implementing the proposed algorithms in real 
systems for wireless monitoring. 

The primary focus of this work is sniffer-channel 
assignment given fixed sniffer locations. Sniffer placement 
has been addressed in [19], which assumes worst-case 
loads in the network, while sniffer-channel assignment 
can be made based on the actual measured loads. In fact, 
both problems can be considered in a single 
optimization framework if we generalize the sniffer placement 
problem to decide online which set of sniffers should be 
turned on given budget constraints. 

Implementation of sniffer-channel assignment should 
incorporate the learning procedure proposed in [11 1. 
The time granularity of channel assignment should be 
sufficiently long to amortize the cost due to channel 
switching. To allow a consistent view of the channel at 
different locations, clock synchronization across multiple 
sniffers is needed. While clock synchronization can be 
performed offline using the frame traces collected 0, 
the accuracy of clock synchronization directly affects the 
inference accuracy of the ICA based methods in the 
sniffer-centric model. The slot length of the binary 
measurements should be chosen to take into account the 
persistence of user transmission activities. 

The channel assignment in its current form is com- 
puted in a centralized manner. This is reasonable since 
the sniffers are likely operated by a single administrative 
domain. An alternative distributed implementation has 
been considered in [33] for the user-centric model based 
on the annealed Gibbs sampler. However, parameters of 
the distributed algorithm need to be properly tuned for 
fast convergence (and hence fewer message exchanges). 
To our understanding, the sniffer-centric model is not 
immediately amenable to distributed implementation. 

8 Conclusion 

In this paper, we formulated the problem of maximizing 
QoM in multi-channel infrastructure wireless networks 
with different a priori knowledge. Two different models 
are considered, which differ by the amount (and type) of 
information available to the sniffers. We show that when 
complete information on the underlying cover graph and 
the access probabilities of users is available, the problem 
is NP-hard, but can be approximated within a constant 
factor. When only binary information about the channel 
activities is available to the sniffers, we propose two 
approaches (QLICA and BICA) so that one can map the 
problem to the one where complete information is at 
hand using the statistics of the sniffers' observations. 



We further conducted a detailed study comparing the 
performance of QLICA and BICA. Finally, evaluations 
demonstrate the effectiveness of our proposed inference 
methods and optimization techniques. 